Characteristics of multi-layer perceptron models in enhancing degraded speech

نویسندگان

  • Thanh Tung Le
  • John S. D. Mason
  • Tadashi Kitamura
چکیده

SUMMARY A multi-layer perceptron (MLP) acting directly in the time-domain is applied as a speech signal enhancer, and the performance examined in the context of three common classes of degradation, namely low bit-rate CELP degradation ie non-linear system degradation, additive noise, and convolution by a linear system. The investigation focuses on two topics: (i) the innuence of non-linearities within the network and (ii) network topology, comparing single and multiple output structures. The objective is to examine how these characteristics innuence network performance and whether this depends on the class of degradation. Experimental results show the importance of matching the en-hancer to the class of degradation. In the case of the CELP coder the standard MLP with its inherently non-linear characteristics is shown to be consistently better than any equivalent linear structure (up to 3.2 dB compared with 1.6 dB SNR improvement). In contrast, when the degradation is from additive noise, a linear enhancer is always superior.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A TS Fuzzy Model Derived from a Typical Multi-Layer Perceptron

In this paper, we introduce a Takagi-Sugeno (TS) fuzzy model which is derived from a typical Multi-Layer Perceptron Neural Network (MLP NN). At first, it is shown that the considered MLP NN can be interpreted as a variety of TS fuzzy model. It is discussed that the utilized Membership Function (MF) in such TS fuzzy model, despite its flexible structure, has some major restrictions. After modify...

متن کامل

Prediction of the waste stabilization pond performance using linear multiple regression and multi-layer perceptron neural network: a case study of Birjand, Iran

Background: Data mining (DM) is an approach used in extracting valuable information from environmental processes. This research depicts a DM approach used in extracting some information from influent and effluent wastewater characteristic data of a waste stabilization pond (WSP) in Birjand, a city in Eastern Iran. Methods: Multiple regression (MR) and neural network (NN) models were examined u...

متن کامل

Developing a Speech Activity Detection System for the DARPA RATS Program

This paper describes the speech activity detection (SAD) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state of the art detection capabilities on audio from highly degraded communication channels. We present two approaches to SAD, one based on Gaussian mixture models, and one based on multi-la...

متن کامل

Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models

In this paper, the gain in LD-CELP speech coding algorithm is predicted using three neural models, that are equipped by genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...

متن کامل

Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models

In this paper, the gain in LD-CELP speech coding algorithm is predicted using three neural models, that are equipped by genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • IEICE Transactions

دوره 78-D  شماره 

صفحات  -

تاریخ انتشار 1994